Abstract. Let $X$ be a hyperelliptic curve of genus $g$ over $\mathbf{Q}$, and let
$X_p$ denote its reduction modulo $p$. We describe a deterministic algorithm that
simultaneously computes the zeta functions of $X_p$ for all primes $p < N$ of
good reduction, whose average complexity per prime is polynomial in $g$ and
$\log N$.



1. Introduction 

A central problem in computational arithmetic geometry is to give efficient
algorithms for the calculation of the zeta function of a variety $X$ over a finite
field $\mathbf{F}_q$, where $q = p^a$. The zeta function of $X$ is the generating
function

$$Z_X(T) = \exp\Big( \sum_{n \ge 1} \frac{\# X(\mathbf{F}_{q^n})}{n} T^n \Big).$$

Dwork proved that $Z_X(T)$ is a rational function, so to compute it means to
explicitly find its numerator and denominator as polynomials. More background on
the algorithmic theory of zeta functions may be found in the survey article [Wan08].

In this paper we focus on the specific case of a hyperelliptic curve $X$ of genus
$g \ge 1$, with a rational Weierstrass point. Assuming $p \ne 2$, such a curve is
given by an equation $y^2 = Q(x)$ where $Q \in \mathbf{F}_q[x]$ is monic and
squarefree, of degree $2g + 1$. The zeta function has the form

$$Z_X(T) = \frac{P(T)}{(1 - T)(1 - qT)},$$

where $P \in \mathbf{Z}[T]$ has degree $2g$.
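
As a point of reference, the numerator $P(T)$ can be found by brute force when $g = 1$ and $q = p$ is tiny: in that case $P(T) = 1 - a_p T + p T^2$ with $a_p = p + 1 - \#X(\mathbf{F}_p)$. The following is a minimal sketch of this naive count, not the algorithm of this paper; its running time is exponential in $\log p$. The function names and the coefficient convention (lowest degree first) are assumptions made for illustration.

```python
def count_points(Q, p):
    """Count points over F_p on the projective curve y^2 = Q(x).

    Q lists the integer coefficients of a monic polynomial of odd degree,
    lowest degree first; p is an odd prime.
    """
    # sqrt_count[v] = number of y in F_p with y^2 = v
    sqrt_count = [0] * p
    for y in range(p):
        sqrt_count[y * y % p] += 1
    affine = 0
    for x in range(p):
        v = 0
        for c in reversed(Q):          # Horner evaluation of Q(x) mod p
            v = (v * x + c) % p
        affine += sqrt_count[v]
    return affine + 1                  # plus the single point at infinity


def genus_one_numerator(Q, p):
    """Coefficients [1, -a_p, p] of P(T) = 1 - a_p T + p T^2 (genus 1 only)."""
    a_p = p + 1 - count_points(Q, p)
    return [1, -a_p, p]
```

For example, `genus_one_numerator([1, 1, 0, 1], 5)` treats the curve $y^2 = x^3 + x + 1$ over $\mathbf{F}_5$.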

In this situation, there are many algorithms known for computing $Z_X(T)$. One
family derives from Schoof's algorithm for elliptic curves [Sch85, Pil90, AH01].
These $\ell$-adic algorithms achieve time complexity polynomial in $\log q$ with an
exponent that grows with $g$; for fixed genus this is polynomial in $\log p$ and
$a$, but in general it is exponential in $g$. (In this paper, time complexity always
means bit complexity in the sense of the multitape Turing model [Pap94].) These
algorithms have been successfully deployed in genus one and two (see [Sut12] and
[GS12] for recent record computations), but the author is aware of no attempts
for $g \ge 3$.

The $p$-adic algorithms form a much more diverse family. These all have the
drawback that the complexity is exponential in $\log p$. One example, highly
relevant to the present work, is Kedlaya's algorithm [Ked01], which has complexity
$p^{1+\epsilon} a^{3+\epsilon} g^{4+\epsilon}$. Here and below, $X^\epsilon$ means
$X^{o(1)}$, where $o(1)$ is a quantity approaching zero as $X \to \infty$. The
dependence on $p$ can be improved to $p^{1/2+\epsilon}$ at the expense of increasing
the exponents of $a$ and $g$ [Har07], but this is still exponential in $\log p$.

The main open problem in this area is whether there exists an algorithm whose
complexity is simultaneously polynomial in $g$ and $\log q$. In other words, we ask
for an algorithm whose complexity is polynomial in the size of the input. The latter
is $\Theta(g \log q)$, the number of bits required to represent $Q(x)$.

In this paper we give a partial affirmative answer to this question. We consider
the following situation. Let $Q \in \mathbf{Z}[x]$ be a monic, squarefree polynomial
of degree $2g + 1$. Let $X$ be the projective closure of the affine curve
$y^2 = Q(x)$, so $X$ is a hyperelliptic curve of genus $g$ over $\mathbf{Q}$. For any
odd prime $p$ not dividing the discriminant of $Q(x)$, let $X_p$ be the reduction of
$X$ modulo $p$, so $X_p$ is a hyperelliptic curve of genus $g$ over $\mathbf{F}_p$.
Let $\|Q\|$ denote the maximum of the absolute values of the coefficients of $Q$.

Theorem 1. Let $N \ge 3$. Assume that $\log g = O(\log N)$ and
$\log \|Q\| = O(\log N)$. There exists a deterministic algorithm that computes the
zeta functions of $X_p$, for all odd primes $p < N$, with $p$ not dividing the
discriminant of $Q$, simultaneously in

$$g^{8+\epsilon} N \log^{3+\epsilon} N$$

bit operations.

Since the number of primes $p < N$ is asymptotically $N/\log N$, the average time
per prime is $g^{8+\epsilon} \log^{4+\epsilon} N$, which is polynomial in the size of
the input.

One obvious application of this result is to the computation of $L$-series of
hyperelliptic curves over $\mathbf{Q}$, with a view towards collecting numerical
data on questions such as the Birch-Swinnerton-Dyer conjecture and the Sato-Tate
conjecture for these curves. Such investigations have recently been carried out by
Fite, Kedlaya, Rotger and Sutherland for curves of genus up to three
[KS08, KS09, FKRS12], with particularly detailed information being obtained for
genus two curves. The new algorithm may make it possible to dramatically extend
the range of their numerical results.

In fact, even in the case of elliptic curves, Theorem 1 already yields the best
known unconditional complexity bound for computing the trace of Frobenius for
all $p < N$ simultaneously. Previously, the best known unconditional deterministic
bound was $\log^{5+\epsilon} p$ per prime, achieved by Schoof's original algorithm.
The Schoof-Elkies-Atkin (SEA) algorithm is conjectured to improve this
(probabilistically) to $\log^{4+\epsilon} p$. For more information about the
heuristics involved in the latter estimate, see the discussion preceding Theorem 13
of [Sut12].

It is likely that this theorem can be extended in several ways. First, the
restriction to curves with a rational Weierstrass point is inherited from [Ked01]
and [Har07]; it surely can be lifted, along the lines of [Har12]. Second, the method
should extend to superelliptic curves, following [GG01, Min10]. Whether it extends
to more general curves or higher-dimensional varieties is a more involved
question, that we intend to address in a future paper. Third, it should be possible
to apply the same method to a hyperelliptic curve defined over a number field $K$.
There are several possible questions: one may ask to compute the zeta function of
the reduction of $X$ modulo $\mathfrak{p}$ for all prime ideals $\mathfrak{p}$ whose
norm is bounded by $N$, or perhaps whose residue characteristic is bounded by $N$.
The resulting complexity bound should depend polynomially on the degree
$[K : \mathbf{Q}]$, and also on the size of the coefficients of a defining
polynomial for $K/\mathbf{Q}$.

The new algorithm has two main ingredients. The first is the author's
modification of Kedlaya's algorithm [Har07]. The portion of this algorithm whose
complexity is exponential in $\log p$ involves computing various 'reduction
matrices'. These are products of the form

$$M(1) M(2) \cdots M(p),$$

where $M(x)$ is a matrix of size $O(g)$ whose entries are linear polynomials in $x$
over $\mathbf{Z}_p$. In that paper we suggested using the method of [BGS07] to
evaluate this product using ring operations in $\mathbf{Z}_p$.

A key observation is that such products may enjoy a certain redundancy: for
$p_1 < p_2$, the product $M(1) \cdots M(p_1)$ may be a subproduct of
$M(1) \cdots M(p_2)$. To realise any advantage from this, we must overcome two
obvious obstructions.

The first is that the values lie in different rings; there is no relation between
$\mathbf{Q}_{p_1}$ and $\mathbf{Q}_{p_2}$ for $p_1 \ne p_2$. We will deal with this
by evaluating the products over $\mathbf{Q}$ rather than $\mathbf{Q}_p$. It would
appear that coefficient explosion renders this approach woefully inefficient.
Coefficient growth does indeed occur, and one of our key tasks is to control it.

The second obstruction is that the entries of $M(x)$ might depend on $p$. This
does in fact occur in the 'horizontal reductions' of [Har07], via the dependence on
$t$ in [Har07, §7.2]. The first clue towards removing this dependence is the
observation that the 'vertical reduction' matrices of [Har07] do not depend on $p$.
The difference is that these matrices 'reduce towards zero', in a sense that will be
made clear in Section 4. Therefore our solution is to revisit the definition of the
relevant cohomology spaces, and design a reduction strategy that 'reduces towards
zero' in all cases. This leads to reduction matrices whose entries depend only on
the coefficients of $Q(x)$, and not on $p$.

The second main ingredient is derived from recent work on the computation of
Wilson quotients, or equivalently the residues $u_p = (p - 1)! \pmod{p^2}$. The best
known algorithm for computing a single $u_p$ has complexity exponential in
$\log p$. For computing the $u_p$ in bulk, the paper [CGH12] introduced an
"accumulating remainder tree" technique that computes $u_p$ for all $p < N$
simultaneously in $N \log^{3+\epsilon} N$ bit operations; that is, in average
polynomial time per prime. In this paper we adapt this algorithm to the matrix
case, replacing the linear polynomial $x$ by the matrix $M(x)$ discussed above, to
compute the products $M(1) \cdots M(p)$, modulo an appropriate power of $p$, in
average polynomial time per prime.

While Theorem 1 is valid for all $g \ge 1$, there is a less technical approach that
may turn out to be more practical for small values of $g$. We give a brief sketch
here, and leave the details to a separate paper. First consider an elliptic curve
$y^2 = Q(x)$. In this case the zeta function is determined modulo $p$ by the
coefficient of $x^{p-1}$ in $Q(x)^{(p-1)/2}$. Let $U_{n,k} \in \mathbf{Z}$ denote the
coefficient of $x^k$ in $Q(x)^n$. The relations $Q^n = Q \cdot Q^{n-1}$ and
$(Q^n)' = nQ' \cdot Q^{n-1}$ imply certain recurrences for the $U_{n,k}$. It turns
out that, after some algebra, one can deduce a recurrence for the triple
$T_n = (U_{n,2n-2}, U_{n,2n-1}, U_{n,2n}) \in \mathbf{Z}^3$ in terms of $T_{n-1}$.
This is another manifestation of 'reduction towards zero'. Applying the machinery
of the accumulating remainder tree, one can then compute $T_{(p-1)/2} \pmod p$ for
all $p < N$ in average polynomial time per prime.
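
The congruence behind this sketch can be checked directly for small primes. The code below is a naive illustration only, under the assumption that $Q$ is a monic cubic given as a coefficient list (lowest degree first): it computes the coefficient of $x^{p-1}$ in $Q(x)^{(p-1)/2}$ by repeated multiplication modulo $p$, which is far slower than the recurrence for $T_n$ described above. The function names are hypothetical.

```python
def poly_mul_mod(A, B, p):
    """Product of two coefficient lists (lowest degree first), reduced mod p."""
    out = [0] * (len(A) + len(B) - 1)
    for i, a in enumerate(A):
        if a:
            for j, b in enumerate(B):
                out[i + j] = (out[i + j] + a * b) % p
    return out


def trace_mod_p(Q, p):
    """Coefficient of x^(p-1) in Q(x)^((p-1)/2), reduced mod p.

    For a cubic Q and an odd prime p of good reduction, this residue agrees
    with the trace of Frobenius a_p = p + 1 - #X(F_p) modulo p.
    """
    power = [1]
    for _ in range((p - 1) // 2):
        power = poly_mul_mod(power, Q, p)
    return power[p - 1] % p
```

For $Q = x^3 + x + 1$ and $p = 5$ one has $Q^2 = x^6 + 2x^4 + 2x^3 + x^2 + 2x + 1$, so the residue is $2$, matching $a_5 = -3 \equiv 2 \pmod 5$.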

For general $g \ge 1$, this may be generalised to compute the Hasse-Witt matrix,
and thus obtain $Z_X(T)$ modulo $p$. For $g = 1$ this is enough to determine
$Z_X(T)$ completely (for sufficiently large $p$). For $g = 2$ it is almost enough;
we get $O(1)$ candidates, which can then be checked by performing suitable group
operations in the Jacobian of the curve. For $g = 3$ we obtain $O(p^{1/2})$
candidates, and then a baby-step/giant-step search in the Jacobian recovers the
zeta function in $p^{1/4+\epsilon}$ bit operations. This is probably still practical
for the feasible range of $p$. For $g \ge 4$ this approach is probably no longer
practical.

2. Preliminaries 

For the rest of the paper we fix the following notation. We try to follow the
notation of [Ked01] and [Har07] as closely as possible, with additional decoration
to keep track of the dependence on $p$.

As in Theorem 1, we take a hyperelliptic curve $X$ given by the equation
$y^2 = Q(x)$ where $Q \in \mathbf{Z}[x]$ is monic and squarefree, and
$\deg Q = 2g + 1$. We denote by $X'$ the curve obtained from $X$ by removing the
point at infinity and the Weierstrass points. It is affine, with coordinate ring

$$A = \mathbf{Q}[x, y, y^{-1}]/(y^2 - Q(x)).$$

Elements of $A$ may be represented as finite sums

$$f = \sum_{i \ge 0,\, j \in \mathbf{Z}} a_{i,j} x^i y^{-j}, \qquad a_{i,j} \in \mathbf{Q}.$$

Let $\Omega$ be the $A$-module of differential forms on $X'$. This is the module
generated by symbols $du$ for $u \in A$, subject to the relations
$d(uv) = u\,dv + v\,du$ for $u, v \in A$, and $du = 0$ for $u \in \mathbf{Q}$. Since
$dy = \frac12 Q'(x)\,dx/y$, elements of $\Omega$ may be represented as finite sums

$$\omega = \sum_{i \ge 0,\, j \in \mathbf{Z}} a_{i,j} x^i y^{-j}\,dx/y, \qquad a_{i,j} \in \mathbf{Q}.$$

Let $\Omega^-$ be the $(-1)$-eigenspace for the hyperelliptic involution
$(x, y) \mapsto (x, -y)$. Its elements are finite sums as above, with
$a_{i,j} \ne 0$ only for even $j$.

Two forms $\omega_1, \omega_2 \in \Omega$ are cohomologous if
$\omega_1 - \omega_2 = df$ for some $f \in A$, and in this case we write
$\omega_1 \sim \omega_2$. Using the same method as in [Ked01], it can be shown
that every $\omega \in \Omega^-$ is cohomologous to a unique
$\omega' = \sum_{i=0}^{2g-1} \lambda_i x^i\,dx/y$ with $\lambda_i \in \mathbf{Q}$,
called the reduction of $\omega$.

Now let $p$ be an odd prime of good reduction for $X$, i.e. such that $p$ does not
divide the discriminant of $Q$. We denote by $X_p$ and $X'_p$ the reductions of $X$
and $X'$ modulo $p$. Thus $X'_p$ has coordinate ring

$$\bar{A}_p = \mathbf{F}_p[x, y, y^{-1}]/(y^2 - \bar{Q}_p(x)),$$

where $\bar{Q}_p \in \mathbf{F}_p[x]$ is the reduction of $Q$ modulo $p$. Let

$$A_p = \mathbf{Z}_p[x, y, y^{-1}]/(y^2 - Q_p(x)),$$

where $Q_p \in \mathbf{Z}_p[x]$ is the image of $Q$, and let $A_p^\dagger$ be the
weak completion of $A_p$, in the sense of Monsky-Washnitzer [MW68]. Define
$\Omega_p$ to be the $A_p^\dagger$-module of differential forms over $\mathbf{Q}_p$
(i.e. generated by $du$ for $u \in A_p^\dagger \otimes_{\mathbf{Z}_p} \mathbf{Q}_p$,
with the same relations as before), and let $\Omega_p^-$ be its $(-1)$-eigenspace.
Two forms $\omega_1, \omega_2 \in \Omega_p$ are cohomologous if
$\omega_1 - \omega_2 = df$ for some
$f \in A_p^\dagger \otimes_{\mathbf{Z}_p} \mathbf{Q}_p$. The quotient of $\Omega_p$
by this relation is by definition the first Monsky-Washnitzer cohomology group
$H^1(X'_p; \mathbf{Q}_p)$, a vector space over $\mathbf{Q}_p$. We are mainly
interested in $V_p = H^1(X'_p; \mathbf{Q}_p)^-$, the subspace corresponding to
$\Omega_p^-$. It has dimension $2g$, with basis $\{x^i\,dx/y\}_{i=0}^{2g-1}$. In
other words, every $\omega \in \Omega_p^-$ is cohomologous to a unique
$\omega' = \sum_{i=0}^{2g-1} \lambda_i x^i\,dx/y$ with $\lambda_i \in \mathbf{Q}_p$,
again called the reduction of $\omega$. The two notions of reduction are compatible
with the obvious natural map $\Omega^- \to \Omega_p^-$.

Let $\bar\sigma_p : \bar{A}_p \to \bar{A}_p$ be the Frobenius map $u \mapsto u^p$.
The essence of Kedlaya's method is to give an explicit expression for a lift
$\sigma_p : A_p^\dagger \to A_p^\dagger$, and then to calculate the matrix of its
action on $V_p$ with respect to the basis given above. The numerator $P(T)$ of the
zeta function of $X_p$ is then simply the characteristic polynomial of this matrix.
The Weil conjectures provide bounds on the coefficients of this polynomial, so it
can be recovered exactly, provided we compute the matrix to sufficiently high
$p$-adic precision.

Already here there is a subtle difference with [Ked01]. In Kedlaya's situation,
the input is a curve over $\mathbf{F}_p$, and he lifts it arbitrarily to
$\mathbf{Z}_p$. In our case, we begin with a curve over $\mathbf{Q}$, and we are
considering the reductions modulo $p$ for all $p$ simultaneously. It is crucial for
our method that we use the 'same lift' for all $p$.

The precise definition of $\sigma_p$ is not so important for us (see [Ked01] for
details). The only information we need is the following description of the action of
$\sigma_p$ on the basis elements $x^i\,dx/y$:

Proposition 2. Let $\mu \ge 1$, and assume that $p > (2\mu - 1)(2g + 1)$. Let
$C_{j,r} \in \mathbf{Z}$ denote the coefficient of $x^r$ in $Q(x)^j$. For
$0 \le j < \mu$, let




For $a, b \ge 1$, with $b$ odd, let $U_p^{a,b}$ denote the reduction of
$x^{pa-1} y^{-pb+1}\,dx/y \in \Omega^-$.

Then for $0 \le i < 2g$, the reduction of $\sigma_p(x^i\,dx/y)$ agrees modulo
$p^\mu$ with the image in $\Omega_p^-$ of

^-1 (2s+l)j 

Proof. This is just a restatement of [Har07, Prop. 4.1], taking into account that
reduction respects the map $\Omega^- \to \Omega_p^-$. □

The point of this result is that to compute the zeta functions of $X_p$ for many
$p$ simultaneously, it will suffice to compute, for finitely many pairs $(a, b)$,
the reductions of $x^{pa-1} y^{-pb+1}\,dx/y$, modulo a suitable power of $p$, for
many $p$ simultaneously. We will return to this in Section 5.

Note that the hypothesis $p > (2\mu - 1)(2g + 1)$ is not stated explicitly in
[Har07, Prop. 4.1], but is a standing assumption for that whole paper; see
[Har07, Thm. 1.1]. The original purpose of this assumption was to simplify analysis
of denominators. Indeed, the algorithm of [Har07], and the statement of
Proposition 2 above, can be modified to work for smaller primes, but this requires
increasing the number of terms in the sum, and carrying more working $p$-adic
digits in the algorithm. On the other hand, in the present paper, we are in effect
forced to use the same $p$-adic precision for all primes. Therefore this hypothesis
now acquires an efficiency implication: to get away with the minimum possible
working precision, we must restrict to those primes $p > (2\mu - 1)(2g + 1)$.

It will be important to keep track of the size of various objects in our discussion.
For a polynomial $f$ with integer coefficients, define $\|f\|$ to be the maximum of
the absolute values of its coefficients. If $M$ is a matrix with integer entries,
define $\|M\| = \max_j \sum_i |M_{ij}|$, i.e. the maximum of the $L^1$ norms of the
columns of $M$.

This norm is submultiplicative with respect to matrix multiplication, because

$$\|MN\| \le \max_j \sum_i \sum_k |M_{ik}|\,|N_{kj}| \le \Big( \max_j \sum_k |N_{kj}| \Big) \Big( \max_k \sum_i |M_{ik}| \Big) = \|N\|\,\|M\|.$$
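
To make the definition concrete, here is a small sketch of the column-sum norm for integer matrices stored as lists of rows. The helper names are assumptions for illustration and are not taken from the paper.

```python
def column_norm(M):
    """Maximum over columns j of the sum over rows i of |M[i][j]|."""
    rows, cols = len(M), len(M[0])
    return max(sum(abs(M[i][j]) for i in range(rows)) for j in range(cols))


def mat_product(M, N):
    """Plain product of two integer matrices given as lists of rows."""
    return [[sum(M[i][k] * N[k][j] for k in range(len(N)))
             for j in range(len(N[0]))] for i in range(len(M))]
```

With `M = [[1, -2], [3, 4]]` and `N = [[0, 5], [-1, 2]]`, the norms are 6 and 7, and the product has norm 24, which is within the bound $\|M\|\,\|N\| = 42$.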

We will freely use the following well-known complexity results. Integers with at
most $n$ bits may be multiplied in $n \log^{1+\epsilon} n$ bit operations via fast
Fourier transform methods, and division with remainder of integers with at most
$n$ bits has the same asymptotic cost. Matrices of size $n$ over a ring $R$ may be
multiplied using $O(n^3)$ ring operations (but see the comments following the proof
of Proposition 4). We denote the set of such matrices by $M_n(R)$. The primes less
than $N$ may be enumerated in $N \log^{2+\epsilon} N$ bit operations. Note that the
usual complexity bound for the sieve of Eratosthenes is not valid in the Turing
model; see [CGH12, Prop. 4] for a discussion and a proof of the bound given.

We also require a deterministic algorithm for solving certain Bezout equations
over $\mathbf{Z}[x]$. The literature on this problem focuses on probabilistic
algorithms. For lack of a suitable reference, we provide the following result. Our
method is quite standard; see for example [vzGG03].

Lemma 3. Let $F, G \in \mathbf{Z}[x]$ be nonzero and relatively prime. Let
$m = \deg F$, $n = \deg G$. Let $\delta \in \mathbf{Z}$ be the resultant of $F$ and
$G$, so $\delta \ne 0$. Then there exist polynomials $R_i, S_i \in \mathbf{Z}[x]$,
for $0 \le i < m + n$, with the following properties.

(a) $F R_i + G S_i = \delta x^i$.

(b) $\deg R_i < n$ and $\deg S_i < m$.

(c) $\log|\delta|$, $\log\|R_i\|$ and $\log\|S_i\|$ are all in
$O((m + n) \log((m + n)\|F\|\,\|G\|))$.

(d) We may compute $\delta$, and all $R_i$ and $S_i$, in

$$(m + n)^{3+\epsilon} \log^{1+\epsilon}((m + n)\|F\|\,\|G\|)$$

bit operations.

Proof. Let $P_k$ denote the space of polynomials in $\mathbf{Z}[x]$ of degree less
than $k$. Let $T$ be the matrix of the map $P_n \times P_m \to P_{m+n}$ given by
$(R, S) \mapsto FR + GS$, i.e. the $(m + n) \times (m + n)$ Sylvester matrix

$$T = \begin{pmatrix}
F_0 & & & G_0 & & \\
F_1 & \ddots & & G_1 & \ddots & \\
\vdots & \ddots & F_0 & \vdots & \ddots & G_0 \\
F_m & & F_1 & G_n & & G_1 \\
& \ddots & \vdots & & \ddots & \vdots \\
& & F_m & & & G_n
\end{pmatrix},$$

where $F_j$ and $G_j$ denote the coefficients of $F$ and $G$. By definition
$\delta = \det T$, and by Cramer's rule the coefficients of $R_i$ and $S_i$ are
given by certain principal minors of $T$. This proves (a) and (b), and (c) follows
by applying the Hadamard bound to each determinant.

We now sketch an algorithm that proves (d). We say that a prime $p$ is 'bad' if it
divides $\delta$ or the leading coefficients of $F$ or $G$; otherwise it is 'good'.
The product of the bad primes is certainly at most $|\delta|\,\|F\|\,\|G\|$. By (c)
we may choose $\beta$ with $\beta = O((m + n) \log((m + n)\|F\|\,\|G\|))$ so that we
are guaranteed $\log \max(|\delta|, \|R_i\|, \|S_i\|) < \beta$. Increasing $\beta$ by
$\log(|\delta|\,\|F\|\,\|G\|) + O(1) = O((m + n) \log((m + n)\|F\|\,\|G\|))$, and
using the estimate $\sum_{p < \beta} \log p \sim \beta$, we may ensure that the
product $J$ of the good primes less than $\beta$ is large enough so that knowledge
of $\delta$, $R_i$, $S_i$ modulo $J$ determines $\delta$, $R_i$, $S_i$ precisely
over $\mathbf{Z}$.

Now perform the following steps. Compute the images of $F$ and $G$ in
$\mathbf{F}_p[x]$ for all $p < \beta$. This costs $(m + n)\beta^{1+\epsilon}$ bit
operations using a remainder tree [Ber08]. For each $p < \beta$, we may determine
if $p$ is good, and if so, find polynomials $R_0, S_0 \in \mathbf{F}_p[x]$ such that
$F R_0 + G S_0 = \delta \pmod p$, $\deg R_0 < n$, $\deg S_0 < m$, in
$(m + n)^{1+\epsilon} \log^{1+\epsilon} p$ bit operations
[vzGG03, Thm. 11.7, Cor. 11.16]. For $i = 1, \ldots, m + n - 1$, compute
$R_i = x R_{i-1} \bmod G$ and $S_i = x S_{i-1} \bmod F$, in
$(m + n) \log^{1+\epsilon} p$ bit operations. Then
$F R_i + G S_i = \delta x^i \pmod p$ and $\deg R_i < n$, $\deg S_i < m$. The cost
over all $i$ is $(m + n)^2 \log^{1+\epsilon} p$, so over all $p < \beta$ is
$(m + n)^2 \beta^{1+\epsilon}$ bit operations. Since $T$ is nonsingular modulo the
good primes, $R_i$ and $S_i$ must agree modulo $p$ with the polynomials $R_i, S_i$
constructed above. Finally we apply a fast interpolation algorithm [Ber08] to each
of the $O((m + n)^2)$ coefficients to reconstruct $\delta$ and all $R_i, S_i$ in
$(m + n)^2 \beta^{1+\epsilon}$ bit operations. □
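
The existence part of Lemma 3 can be illustrated on small inputs. The sketch below does not follow the multimodular algorithm of the proof; as a simpler stand-in, it inverts the Sylvester-type matrix $T$ over $\mathbf{Q}$ with exact fractions and scales by $\det T$, which is practical only for small degrees. The function names, the coefficient convention (lowest degree first), and the sign convention for $\delta = \det T$ are assumptions for illustration.

```python
from fractions import Fraction


def sylvester(F, G):
    """Matrix of (R, S) -> F*R + G*S, with deg R < n = deg G, deg S < m = deg F."""
    m, n = len(F) - 1, len(G) - 1
    size = m + n
    T = [[0] * size for _ in range(size)]
    for col in range(n):                  # column for x^col * F
        for k, c in enumerate(F):
            T[col + k][col] = c
    for col in range(m):                  # column for x^col * G
        for k, c in enumerate(G):
            T[col + k][n + col] = c
    return T


def det_and_inverse(T):
    """Determinant and inverse of an integer matrix by exact Gauss-Jordan."""
    size = len(T)
    A = [[Fraction(v) for v in row] + [Fraction(int(i == j)) for j in range(size)]
         for i, row in enumerate(T)]
    det = Fraction(1)
    for c in range(size):
        pivot = next((r for r in range(c, size) if A[r][c] != 0), None)
        if pivot is None:
            raise ValueError("F and G are not relatively prime")
        if pivot != c:
            A[c], A[pivot] = A[pivot], A[c]
            det = -det
        det *= A[c][c]
        inv_pivot = 1 / A[c][c]
        A[c] = [v * inv_pivot for v in A[c]]
        for r in range(size):
            if r != c and A[r][c] != 0:
                factor = A[r][c]
                A[r] = [vr - factor * vc for vr, vc in zip(A[r], A[c])]
    return det, [row[size:] for row in A]


def bezout_family(F, G):
    """Return (delta, [(R_i, S_i)]) with F*R_i + G*S_i = delta * x^i."""
    m, n = len(F) - 1, len(G) - 1
    det, inverse = det_and_inverse(sylvester(F, G))
    delta = int(det)
    family = []
    for i in range(m + n):
        # delta * T^{-1} e_i is a column of the adjugate, hence integral.
        column = [delta * inverse[r][i] for r in range(m + n)]
        assert all(v.denominator == 1 for v in column)
        ints = [int(v) for v in column]
        family.append((ints[:n], ints[n:]))
    return delta, family
```

For instance, `bezout_family([1, 0, 1], [-1, 1])` handles $F = x^2 + 1$ and $G = x - 1$, whose resultant has absolute value $F(1) = 2$.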

Finally, we mention that we will omit any analysis of the costs of data
rearrangement that must be counted in the Turing model; these are all subsumed
within the arithmetic cost, along the same lines as the Appendix to [BGS07].

3. An accumulating remainder tree for matrices

The following is a matrix generalisation of [CGH12, Theorem 1].

Proposition 4. Let $n \ge 1$, $\lambda \ge 1$ and $B \ge 2$ be integers, and let
$\tau \in \mathbf{R}$, $\tau \ge 1$. Assume that $\log \lambda = O(\log B)$ and
$\log \tau = O(\log B)$. We are given as input a sequence of matrices
$M_0, M_1, \ldots, M_{B-1} \in M_n(\mathbf{Z})$, with $\log\|M_i\| \le \tau$ for all
$i$. Then we may compute

$$M_0 M_1 \cdots M_{(p-1)/2} \pmod{p^\lambda}$$

for all primes $3 \le p < 2B$ simultaneously in

$$n^3 (\tau + \lambda) B \log^{2+\epsilon} B$$

bit operations.

Proof. Let $\ell = \lceil \log_2 B \rceil$. We will construct several binary trees
of depth $\ell$, whose nodes are indexed by the pairs $(i, j)$ with
$0 \le i \le \ell$ and $0 \le j < 2^i$. The root node is $(0, 0)$, the children of
$(i, j)$ are $(i + 1, 2j)$ and $(i + 1, 2j + 1)$, and the leaf nodes are $(\ell, j)$
for $0 \le j < 2^\ell$.

For each node $(i, j)$ let

$$U_{i,j} = \Big\{ k \in \mathbf{Z} : j \frac{B}{2^i} \le k < (j + 1) \frac{B}{2^i} \Big\}.$$

Thus $U_{i,0}, \ldots, U_{i,2^i-1}$ partition the interval $0 \le k < B$ into $2^i$
sets of roughly equal size. For $0 \le i < \ell$ we have the disjoint union
$U_{i,j} = U_{i+1,2j} \cup U_{i+1,2j+1}$. For the leaf nodes, we have
$|U_{\ell,j}| = 0$ or $1$ for every $j$, and for every $0 \le k < B$, there is
exactly one $j$ such that $U_{\ell,j} = \{k\}$, namely
$j = \lfloor 2^\ell k / B \rfloor$.

Now for each node define

$$P_{i,j} = \prod_{\substack{p \text{ prime} \\ \frac12 (p-1) \in U_{i,j}}} p^\lambda,$$

$$A_{i,j} = \prod_{k \in U_{i,j}} M_{k+1},$$

$$C_{i,j} = M_0 A_{i,0} A_{i,1} \cdots A_{i,j-1} \pmod{P_{i,j}},$$

where for convenience we put $M_B = I$ (the identity matrix). Implicit in the
product notation for $A_{i,j}$ is that the $M_k$ are always multiplied in the
correct left-to-right order, and that if $U_{i,j} = \emptyset$ then $A_{i,j} = I$.

Note that the desired output may be recovered from the leaf nodes of the $C_{i,j}$
tree. Indeed, suppose that $3 \le p < 2B$. Let $k = \frac12 (p - 1)$, and choose $j$
as above so that $U_{\ell,j} = \{k\}$. Then $P_{\ell,j} = p^\lambda$, and
$C_{\ell,j} = M_0 M_1 \cdots M_k \pmod{p^\lambda}$.

Now we explain how to compute the values in the trees, beginning with the
$P_{i,j}$ tree. After enumerating the primes less than $2B$ in
$B \log^{2+\epsilon} B$ bit operations, we use a standard product tree strategy
[Ber08], working from the bottom of the tree to the top, using the relation
$P_{i,j} = P_{i+1,2j} P_{i+1,2j+1}$. To estimate the complexity, note that
$\log P_{i,j} = O(N_{i,j} \lambda \log B)$, where $N_{i,j}$ is the number of primes
$p$ with $\frac12 (p - 1) \in U_{i,j}$, so each product costs
$N_{i,j} \lambda \log B \log^{1+\epsilon}(N_{i,j} \lambda \log B) = N_{i,j} \lambda \log^{2+\epsilon} B$
bit operations. Since $\sum_j N_{i,j} = \pi(2B) - 1 = O(B / \log B)$, the cost over
all intervals at level $i$ is $B \lambda \log^{1+\epsilon} B$ bit operations. Over
all $O(\log B)$ levels of the tree, the cost is $B \lambda \log^{2+\epsilon} B$ bit
operations.

The $A_{i,j}$ tree is computed in a similar manner. We have
$\log\|A_{i,j}\| \le |U_{i,j}|\,\tau$ by submultiplicativity. Computing the product
$A_{i,j} = A_{i+1,2j} A_{i+1,2j+1}$ requires $O(n^3)$ multiplications of integers
with $O(|U_{i,j}|\,\tau)$ bits, costing
$n^3 |U_{i,j}|\,\tau \log^{1+\epsilon}(|U_{i,j}|\,\tau)$ bit operations. The total
cost at level $i$ is $n^3 B \tau \log^{1+\epsilon} B$, and the cost over all levels
is $n^3 B \tau \log^{2+\epsilon} B$ bit operations.

For the $C_{i,j}$ tree, we work from the top of the tree to the bottom, using the
initial condition $C_{0,0} = M_0 \pmod{P_{0,0}}$, and the relations

$$C_{i+1,2j} = C_{i,j} \pmod{P_{i+1,2j}},$$
$$C_{i+1,2j+1} = C_{i,j} A_{i+1,2j} \pmod{P_{i+1,2j+1}}.$$

At each node we must perform $n^2$ divisions, and possibly $n^3$ multiplications,
of integers with $O(\max(|U_{i,j}|\,\tau, N_{i,j} \lambda \log B))$ bits. The final
cost bound follows by the same argument as the previous paragraphs. □
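
The three trees in this proof can be prototyped compactly. The following is a minimal sketch under simplifying assumptions: it uses Python's built-in big integers and schoolbook matrix multiplication, splits index intervals at integer midpoints instead of using the sets $U_{i,j}$ exactly, and finds primes by trial division, so it illustrates the data flow without attaining the stated complexity. All function names are hypothetical.

```python
def mat_mul(X, Y, mod=None):
    """Product of two square integer matrices, optionally reduced modulo mod."""
    n = len(X)
    Z = [[sum(X[i][k] * Y[k][j] for k in range(n)) for j in range(n)]
         for i in range(n)]
    if mod is not None:
        Z = [[v % mod for v in row] for row in Z]
    return Z


def is_prime(m):
    """Trial division; adequate for a small illustration only."""
    if m < 2:
        return False
    d = 2
    while d * d <= m:
        if m % d == 0:
            return False
        d += 1
    return True


def accumulating_remainder_tree(mats, lam):
    """Return {p: M_0 M_1 ... M_{(p-1)/2} mod p^lam} for primes 3 <= p < 2*len(mats)."""
    B, n = len(mats), len(mats[0])
    identity = [[int(i == j) for j in range(n)] for i in range(n)]

    def build(lo, hi):
        # Node for indices lo <= k < hi, holding
        #   P = product of p^lam over primes p = 2k + 1 >= 3, and
        #   A = M_{lo+1} ... M_{hi}, with the convention M_B = I.
        if hi - lo == 1:
            p = 2 * lo + 1
            P = p ** lam if p >= 3 and is_prime(p) else 1
            A = mats[lo + 1] if lo + 1 < B else identity
            return (lo, hi, P, A, None, None)
        mid = (lo + hi) // 2
        left, right = build(lo, mid), build(mid, hi)
        return (lo, hi, left[2] * right[2], mat_mul(left[3], right[3]), left, right)

    result = {}

    def descend(node, C):
        lo, hi, P, A, left, right = node
        C = [[v % P for v in row] for row in C]   # C = M_0 ... M_lo modulo P
        if left is None:
            if P > 1:
                result[2 * lo + 1] = C
            return
        descend(left, C)                          # same prefix, smaller modulus
        descend(right, mat_mul(C, left[3]))       # extend the prefix past the left half

    descend(build(0, B), mats[0])
    return result
```

Calling `accumulating_remainder_tree(mats, 2)` on a list of twelve $2 \times 2$ matrices returns one residue matrix for each prime from 3 to 23.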

There are several ways to improve the complexity bound in Proposition 4, at
the expense of obfuscating the statement of the final result. One could of course
substitute a faster matrix multiplication algorithm, such as Strassen's algorithm
[Str69]. This would reduce the exponent of $n$, and hence the exponent of $g$ in
Theorem 1. Another modification, more important in practice, is that one can
multiply integer matrices by computing the Fourier transform of the entries,
multiplying the matrices of Fourier coefficients, and finally transforming back.
The resulting complexity bound depends on what integer multiplication algorithm is
being used. For $m$-bit matrix entries, roughly speaking we expect the complexity
to drop from $n^3 m \log^{1+\epsilon} m$ to $n^2 m \log^{1+\epsilon} m + n^3 m$. For
small $n$ and large $m$ the



COUNTING POINTS ON HYPERELLIPTIC CURVES IN AVERAGE POLYNOMIAL TIME 9 



first term dominates. This corresponds to small $g$ and large $N$ in Theorem 1, and
leads to a savings of a factor of $O(g)$ in Theorem 1 as $N \to \infty$.

4. Reduction towards zero 

We now return to cohomology. Define a collection of $\mathbf{Q}$-subspaces
$W_{s,t} \subseteq \Omega^-$, for $s \ge -1$ and $t \in \mathbf{Z}$, as follows. If
$s \ge 0$, put

$$W_{s,t} = \{ F(x) x^s y^{-2t}\,dx/y : F \in \mathbf{Q}[x],\ \deg F \le 2g \}.$$

For $s = -1$, we use the same definition, but insist that the constant term of
$F(x)$ is zero, so that the expression $F(x) x^s y^{-2t}\,dx/y$ still defines an
element of $\Omega^-$.

Our goal in this section is to describe explicit reduction maps between the various
$W_{s,t}$, that send differentials to cohomologous differentials. We will represent
these maps by $(2g + 1) \times (2g + 1)$ matrices, acting on coordinate vectors with
respect to the natural basis
$(x^s y^{-2t}\,dx/y, \ldots, x^{s+2g} y^{-2t}\,dx/y)$ for each $W_{s,t}$. In the case
$s = -1$, the dimension is only $2g$, but it will be convenient to represent
elements of $W_{-1,t}$ as vectors of length $2g + 1$, where it is understood that
the first coordinate is always zero. The first row of any matrix mapping into such a
space will always be zero.

We will write $\delta \in \mathbf{Z}$ for the discriminant of $Q(x)$, or
equivalently the resultant of $Q(x)$ and $Q'(x)$. It is nonzero because $Q(x)$ is
squarefree. The constant term $c_0$ of $Q(x)$ will also play a special role; some of
our results need to be stated slightly differently in the case that $c_0 = 0$.

Our first result is algebraically the same as the 'horizontal reduction' discussed
in [Har07, Prop. 5.4]. However, we now treat both $s$ and $t$ as variables, and we
must analyse coefficient growth, as we are working over $\mathbf{Q}$ rather than
$\mathbf{Q}_p$.

Proposition 5 (Horizontal reduction). Let

$$D_H(s, t) = (2g + 1)(2t - 1) - 2s \in \mathbf{Z}[s, t].$$

There exists a matrix $M_H \in M_{2g+1}(\mathbf{Z}[s, t])$ with the following
properties.

(a) Let $s \ge 0$, $t \in \mathbf{Z}$. Then $D_H(s, t) \ne 0$, and the map
$D_H(s, t)^{-1} M_H(s, t)$ sends a differential $\omega \in W_{s,t}$ to a
cohomologous differential in $W_{s-1,t}$.

(b) The entries of $M_H$ have degree at most 1.

(c) $\log\|M_H\| = O(\log(g\|Q\|))$.

(d) $M_H$ may be computed in $g \log^{1+\epsilon}(g\|Q\|)$ bit operations.

Proof. Using the relations $Q(x) = y^2$ and $Q'(x)\,dx = 2y\,dy$, we have

$$d(x^s y^{-2t+1}) = s x^{s-1} y^{-2t+1}\,dx - (2t - 1) x^s y^{-2t}\,dy
= \big( sQ(x) - \tfrac12 (2t - 1) x Q'(x) \big) x^{s-1} y^{-2t}\,dx/y. \tag{1}$$

Let $Q(x) = x^{2g+1} + P(x)$, where $P \in \mathbf{Z}[x]$ has degree at most $2g$.
After substituting this into the previous equation and rearranging, we obtain

$$x^{s+2g} y^{-2t}\,dx/y \sim \frac{2sP(x) - (2t - 1) x P'(x)}{D_H(s, t)}\, x^{s-1} y^{-2t}\,dx/y.$$

We may therefore take

$$M_H(s, t) = \begin{pmatrix}
0 & 0 & \cdots & 0 & C_0 \\
D_H(s, t) & 0 & \cdots & 0 & C_1 \\
0 & D_H(s, t) & \cdots & 0 & C_2 \\
\vdots & & \ddots & \vdots & \vdots \\
0 & 0 & \cdots & D_H(s, t) & C_{2g}
\end{pmatrix},$$

where $C_i = C_i(s, t)$ is the coefficient of $x^i$ in the polynomial
$2sP(x) - (2t - 1) x P'(x)$. Note that $D_H(s, t)$ is nonzero for
$s, t \in \mathbf{Z}$ because it only assumes odd values.

The bound for $\|M_H\|$ follows from the estimate $\|P'\| \le 2g\|P\|$. The
complexity bound covers $O(g)$ multiplications of integers with $O(\log\|Q\|)$ bits
by integers with $O(\log g)$ bits. □
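
The horizontal reduction matrix is simple enough to write down directly. The sketch below assumes the layout implied by the basis convention of this section (the first $2g$ basis monomials are kept and scaled by $D_H$, and the last one is rewritten using the relation from the proof); the function name and the coefficient convention (lowest degree first) are assumptions for illustration.

```python
def horizontal_reduction(Q, s, t):
    """Return (D_H(s, t), M_H(s, t)) for a monic Q of degree 2g + 1.

    Q lists integer coefficients, lowest degree first. Columns of M_H are the
    images of x^(s+j) y^(-2t) dx/y, expressed in the basis
    x^(s-1+i) y^(-2t) dx/y of W_{s-1,t}, before division by D_H.
    """
    g = (len(Q) - 2) // 2
    P = Q[:-1]                                  # Q = x^(2g+1) + P, deg P <= 2g
    D = (2 * g + 1) * (2 * t - 1) - 2 * s
    # C_i = coefficient of x^i in 2 s P(x) - (2t - 1) x P'(x)
    C = [(2 * s - (2 * t - 1) * i) * c for i, c in enumerate(P)]
    size = 2 * g + 1
    M = [[0] * size for _ in range(size)]
    for j in range(size - 1):
        M[j + 1][j] = D                         # x^(s+j) is kept, scaled by D
    for i in range(size):
        M[i][size - 1] = C[i]                   # x^(s+2g) is rewritten via (1)
    return D, M
```

For $s = 0$ the entry $C_0 = 2sP(0)$ vanishes, so the first row of the matrix is zero, as required for a map into $W_{-1,t}$.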

Next we give a generalisation of the 'vertical reduction' of [Har07, Prop. 5.1],
which was a map $W_{-1,t} \to W_{-1,t-1}$. It turns out that the most natural
generalisation yields a map $W_{s,t} \to W_{s-1,t-1}$ rather than
$W_{s,t} \to W_{s,t-1}$. (The discrepancy is resolved by reinterpreting the
vertical reduction of [Har07] as a map from a codimension 1 subspace of $W_{0,t}$
to $W_{-1,t-1}$.)

Proposition 6 (Diagonal reduction). Let

$$D_D(t) = 2t - 1 \in \mathbf{Z}[t].$$

There exists a matrix $M_D \in M_{2g+1}(\mathbf{Z}[s, t])$ with the following
properties.

(a) Let $s \ge 0$, $t \in \mathbf{Z}$. Then the map
$\delta^{-1} D_D(t)^{-1} M_D(s, t)$ sends a differential $\omega \in W_{s,t}$ to a
cohomologous differential in $W_{s-1,t-1}$.

(b) The entries of $M_D$ have degree at most 1.

(c) $\log|\delta|$ and $\log\|M_D\|$ are both in $O(g \log(g\|Q\|))$.

(d) $\delta$ and $M_D$ may be computed in $g^{3+\epsilon} \log^{1+\epsilon}(g\|Q\|)$
bit operations.

Proof. According to Lemma 3, for each $0 \le i \le 2g$, there exist
$R_i, S_i \in \mathbf{Z}[x]$, with $\deg R_i \le 2g - 1$ and $\deg S_i \le 2g$, such
that

$$\delta x^i = R_i(x) Q(x) + S_i(x) Q'(x).$$

This implies that

$$\delta x^{s+i} y^{-2t}\,dx/y = x^s R_i(x) Q(x) y^{-2t}\,dx/y + x^s S_i(x) Q'(x) y^{-2t}\,dx/y
= x^s R_i(x) y^{-2t+2}\,dx/y + 2 x^s S_i(x) y^{-2t}\,dy.$$

Since

$$d(x^s S_i(x) y^{-2t+1}) = (x^s S_i(x))' y^{-2t+1}\,dx + (-2t + 1) x^s S_i(x) y^{-2t}\,dy,$$

after some algebra we obtain the relation in cohomology

$$x^{s+i} y^{-2t}\,dx/y \sim \frac{(2t - 1) x R_i(x) + 2s S_i(x) + 2x S_i'(x)}{(2t - 1)\delta}\, x^{s-1} y^{-2t+2}\,dx/y. \tag{2}$$

According to this formula, we may take $M_D$ to be the matrix whose $(i + 1)$-th
column consists of the coefficients of
$(2t - 1) x R_i(x) + 2s S_i(x) + 2x S_i'(x)$. These coefficients are clearly of
degree at most 1 in $s$ and $t$, and $D_D(t)$ is nonzero because $2t - 1$ is odd.
This proves (a) and (b), and (c) and (d) follow from Lemma 3. □



We will also need a genuine 'vertical reduction' in the generic case $c_0 \ne 0$:

Proposition 7 (Vertical reduction). Assume that $c_0 \ne 0$. Let

$$D_V(t) = 2t - 1 \in \mathbf{Z}[t].$$

There exists a matrix $M_V \in M_{2g+1}(\mathbf{Z}[s, t])$ with the following
properties.

(a) Let $s \ge 0$, $t \in \mathbf{Z}$. Then the map
$(c_0 \delta)^{-1} D_V(t)^{-1} M_V(s, t)$ sends a differential
$\omega \in W_{s,t}$ to a cohomologous differential in $W_{s,t-1}$.

(b) The entries of $M_V$ have degree at most 1.

(c) $\log\|M_V\| = O(g \log(g\|Q\|))$.

(d) $M_V$ may be computed in $g^{3+\epsilon} \log^{1+\epsilon}(g\|Q\|)$ bit
operations.

Proof. We continue the calculation of Proposition 6. Write
$S_i(x) = h_i + x T_i(x)$, where $h_i \in \mathbf{Z}$, $T_i \in \mathbf{Z}[x]$,
$\deg T_i \le 2g - 1$. The right hand side of (2) becomes

$$\frac{1}{(2t - 1)\delta} \Big( 2 h_i s x^{s-1} + \big( (2t - 1) R_i(x) + 2s T_i(x) + 2 S_i'(x) \big) x^s \Big) y^{-2t+2}\,dx/y.$$

Our goal is now to reduce the $x^{s-1} y^{-2t+2}\,dx/y$ term 'to the right'. Write
$Q(x) = c_0 + x P(x)$, where $P \in \mathbf{Z}[x]$, $\deg P \le 2g$. Replacing $t$
by $t - 1$ in (1), we obtain

$$0 \sim 2s Q(x) x^{s-1} y^{-2t+2}\,dx/y - (2t - 3) Q'(x) x^s y^{-2t+2}\,dx/y,$$

so

$$2s x^{s-1} y^{-2t+2}\,dx/y \sim \frac{(2t - 3) Q'(x) - 2s P(x)}{c_0}\, x^s y^{-2t+2}\,dx/y.$$

Combining everything, we finally have

$$x^{s+i} y^{-2t}\,dx/y \sim \frac{(2t - 3) h_i Q' - 2 h_i s P + (2t - 1) c_0 R_i + 2 c_0 s T_i + 2 c_0 S_i'}{(2t - 1) \delta c_0}\, x^s y^{-2t+2}\,dx/y.$$

The columns of $M_V$ are obtained from the numerator of this expression in the same
way as in the proof of Proposition 6. □

The next result has no analogue in [Har07]. For each $a$ and $b$, it will allow us
to reduce the forms $x^{pa-1} y^{-pb+1}\,dx/y$ of Proposition 2 along the same
reduction path, for many $p$ simultaneously.

We say that a pair of integers $(a, b)$ is admissible if the following conditions
hold:

(i) $a, b \ge 1$ and $b$ is odd;

(ii) if $c_0 = 0$, then $b < 2a$;

(iii) $a = O(g^2)$ and $b = O(g)$.



Proposition 8 (Reduction towards zero). Let $(a, b)$ be an admissible pair, and let
$r \ge 1$. There exists a matrix $M_r^{a,b} \in M_{2g+1}(\mathbf{Z})$ and a nonzero
integer $D_r^{a,b}$ with the following properties.

(a) The map $(D_r^{a,b})^{-1} M_r^{a,b}$ sends a differential

$$\omega \in W_{a(2r+1)-1,\ \frac12 (b(2r+1)-1)}$$

to a cohomologous differential in

$$W_{a(2r-1)-1,\ \frac12 (b(2r-1)-1)}.$$

(b) $\log\|M_r^{a,b}\|$ and $\log|D_r^{a,b}|$ are in $O(g^2 \log(gr\|Q\|))$.

(c) $M_r^{a,b}$ and $D_r^{a,b}$ may be computed in
$g^{5+\epsilon} \log^{1+\epsilon}(gr\|Q\|)$ bit operations.

Proof. Our goal is to reduce along the vector $(-2a, -b)$ in the $(s, t)$-plane. We
consider two cases.

First suppose that $b < 2a$. Then we may construct the required map by performing
$b$ diagonal reductions (Proposition 6) followed by $2a - b$ horizontal reductions
(Proposition 5). More precisely, let

$$s_0 = a(2r + 1) - 1,$$
$$t_0 = \tfrac12 (b(2r + 1) - 1),$$
$$s_1 = s_0 - b = a(2r + 1) - b - 1,$$
$$t_1 = t_0 - b = \tfrac12 (b(2r - 1) - 1),$$
$$s_2 = s_1 - (2a - b) = a(2r - 1) - 1,$$
$$t_2 = t_1 = \tfrac12 (b(2r - 1) - 1).$$

These all have absolute value in $O(ar)$. Let

$$M' = M_D(s_0 - b + 1, t_0 - b + 1) \cdots M_D(s_0 - 1, t_0 - 1) M_D(s_0, t_0),$$
$$D' = \delta^b D_D(t_0 - b + 1) \cdots D_D(t_0 - 1) D_D(t_0),$$
$$M'' = M_H(s_1 - 2a + b + 1, t_1) \cdots M_H(s_1 - 1, t_1) M_H(s_1, t_1),$$
$$D'' = D_H(s_1 - 2a + b + 1, t_1) \cdots D_H(s_1 - 1, t_1) D_H(s_1, t_1).$$

Then $(D')^{-1} M'$ maps $W_{s_0,t_0}$ to $W_{s_1,t_1}$, and $(D'')^{-1} M''$ maps
$W_{s_1,t_1}$ to $W_{s_2,t_2}$. For (a) we should therefore take the composition

$$M_r^{a,b} = M'' M', \qquad D_r^{a,b} = D'' D',$$

so that $(D_r^{a,b})^{-1} M_r^{a,b}$ maps $W_{s_0,t_0}$ to $W_{s_2,t_2}$.

To prove (b), note that for each $0 \le j < b$, we have
$\|M_D(s_0 - j, t_0 - j)\| = O(ar\|M_D\|)$. Similarly,
$\|M_H(s_1 - j, t_1)\| = O(ar\|M_H\|)$ for $0 \le j < 2a - b$. Thus

$$\log\|M_r^{a,b}\| = O\big( b \log(ar\|M_D\|) + (2a - b) \log(ar\|M_H\|) \big) = O(g^2 \log(gr\|Q\|)).$$

A similar argument yields $\log|D_r^{a,b}| = O(g^2 \log(gr\|Q\|))$.

For (c), we may compute $D'$ and $D''$, and hence $D_r^{a,b}$, using a product tree;
the complexity is soft-linear in the number of bits of output, which is
$O(g^2 \log(gr\|Q\|))$. The same result holds for $M_r^{a,b}$, with an additional
factor of $O(g^3)$ to account for the matrix multiplications. Therefore we obtain
the bit complexity bound $g^{5+\epsilon} \log^{1+\epsilon}(gr\|Q\|)$. This bound
also incorporates the invocations of Propositions 5 and 6.

Now consider the case $b > 2a$. By hypothesis we may assume that $c_0 \neq 0$, so
that vertical reductions (Proposition 7) are permissible. We proceed by performing
$2a$ diagonal reductions followed by $b - 2a$ vertical reductions. In other words, we
put

\[
\begin{aligned}
s_0 &= a(2r+1) - 1, \\
t_0 &= \tfrac12 (b(2r+1) - 1), \\
s_1 &= s_0 - 2a = a(2r-1) - 1, \\
t_1 &= t_0 - 2a = \tfrac12 (b(2r+1) - 1) - 2a, \\
s_2 &= s_1 = a(2r-1) - 1, \\
t_2 &= t_1 - (b - 2a) = \tfrac12 (b(2r-1) - 1),
\end{aligned}
\]
and
\[
\begin{aligned}
M' &= M_D(s_0 - 2a + 1, t_0 - 2a + 1) \cdots M_D(s_0 - 1, t_0 - 1) M_D(s_0, t_0), \\
D' &= \delta^{2a} D_D(t_0 - 2a + 1) \cdots D_D(t_0 - 1) D_D(t_0), \\
M'' &= M_V(s_1, t_1 - b + 2a + 1) \cdots M_V(s_1, t_1 - 1) M_V(s_1, t_1), \\
D'' &= (c_0\delta)^{b-2a} D_V(t_1 - b + 2a + 1) \cdots D_V(t_1 - 1) D_V(t_1),
\end{aligned}
\]
and $M_r^{a,b} = M''M'$, $D_r^{a,b} = D''D'$. As before, $(D_r^{a,b})^{-1}M_r^{a,b}$ maps $W_{s_0,t_0}$ to $W_{s_2,t_2}$,
and the required bounds for $\log \|M_r^{a,b}\|$ and $\log |D_r^{a,b}|$, and the complexity bounds,
follow in the same way. □

Iterating the previous result enables us to reduce to $W_{a-1,\frac12(b-1)}$. The next result
finishes the job, giving the final reduction to $W_{-1,0}$.

Proposition 9 (Final reduction). Let $(a,b)$ be an admissible pair. There exists a
matrix $M_0^{a,b} \in M_{2g+1}(\mathbb{Z})$ and a nonzero integer $D_0^{a,b}$ with the following properties.

(a) The map $(D_0^{a,b})^{-1}M_0^{a,b}$ sends a differential $\omega$ in $W_{a-1,\frac12(b-1)}$ to a cohomologous
differential in $W_{-1,0}$.

(b) $\log \|M_0^{a,b}\| = O(g^2 \log(g\|Q\|))$ and $\log |D_0^{a,b}| = O(g^2 \log(g\|Q\|))$.

(c) $M_0^{a,b}$ and $D_0^{a,b}$ may be computed in $g^{5+\varepsilon} \log^{1+\varepsilon}(g\|Q\|)$ bit operations.

Proof. If $b < 2a$, we perform $\frac12(b-1)$ diagonal reductions followed by $a - \frac12(b-1)$
horizontal reductions. If $b > 2a$, we perform $\frac12(b-1) - a$ vertical reductions followed
by $a$ diagonal reductions. We omit the details, which are essentially the same as in
the proof of Proposition 8. □
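
As a sanity check on the index bookkeeping in Propositions 8 and 9, the following Python sketch follows the pair $(s,t)$ from $W_{ap-1,\frac12(bp-1)}$ down to $W_{-1,0}$. It only counts reductions and does not build any matrices. The function names are hypothetical, and the sketch assumes that a diagonal reduction lowers both $s$ and $t$ by one, a horizontal reduction lowers $s$ by one, and a vertical reduction lowers $t$ by one.

```python
# Effect of one reduction of each kind on (s, t) (assumed conventions).
MOVES = {"diagonal": (-1, -1), "horizontal": (-1, 0), "vertical": (0, -1)}


def prop8_steps(a, b):
    """Reduction counts for one step along the vector (-2a, -b)."""
    if b < 2 * a:
        return [("diagonal", b), ("horizontal", 2 * a - b)]
    return [("diagonal", 2 * a), ("vertical", b - 2 * a)]


def prop9_steps(a, b):
    """Reduction counts for the final step to W_{-1,0}."""
    h = (b - 1) // 2
    if b < 2 * a:
        return [("diagonal", h), ("horizontal", a - h)]
    return [("vertical", h - a), ("diagonal", a)]


def apply_steps(s, t, steps):
    for kind, count in steps:
        ds, dt = MOVES[kind]
        s, t = s + count * ds, t + count * dt
    return s, t


def descend(a, b, p):
    """Follow (s, t) from (ap - 1, (bp - 1)/2) down to the final space."""
    s, t = a * p - 1, (b * p - 1) // 2
    for r in range((p - 1) // 2, 0, -1):
        # Source space of the r-th application of Proposition 8.
        assert (s, t) == (a * (2 * r + 1) - 1, (b * (2 * r + 1) - 1) // 2)
        s, t = apply_steps(s, t, prop8_steps(a, b))
    # Source space of Proposition 9.
    assert (s, t) == (a - 1, (b - 1) // 2)
    return apply_steps(s, t, prop9_steps(a, b))
```

For example, `descend(3, 3, 23)` exercises the case $b < 2a$ and `descend(1, 5, 23)` the case $b > 2a$; both end at $(-1, 0)$.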

5. The main algorithm 

Recall the forms $U_p^{a,b}$ introduced in Proposition 2. We say that a pair $(a,b)$ is
p-admissible if it satisfies the following conditions:

(i) $a, b \geq 1$ and $b$ is odd;

(ii) if $p$ divides $c_0$, then $b < 2a$;

(iii) $a = O(g^2)$ and $b = O(g)$;

(iv) $p$ does not divide $\delta$;

(v) $p > (2g+1)b + 2a$.

Note that p-admissibility implies admissibility.
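
The conditions can be collected into a small predicate. This is a sketch only: the name `is_p_admissible` is hypothetical, condition (iv) is read here as "$p$ does not divide $\delta$", where $\delta$ is the integer appearing in the denominators of the diagonal reductions, and the asymptotic condition (iii) is not something a single pair can be tested against, so it is left out.

```python
def is_p_admissible(a, b, p, g, c0, delta):
    """Check conditions (i), (ii), (iv) and (v) for the pair (a, b).

    Condition (iii) is asymptotic and is deliberately not checked.
    delta is assumed to be the integer from the diagonal reductions.
    """
    if a < 1 or b < 1 or b % 2 == 0:      # (i)
        return False
    if c0 % p == 0 and not b < 2 * a:     # (ii); c0 = 0 counts as divisible
        return False
    if delta % p == 0:                    # (iv), under the stated reading
        return False
    return p > (2 * g + 1) * b + 2 * a    # (v)
```

With $g = 2$ and $(a,b) = (3,3)$, condition (v) requires $p > 21$, so the pair passes for $p = 23$ and fails for $p = 19$.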

Proposition 10. Let $(a,b)$ be admissible, and let $\nu \geq 1$, $N \geq 3$. Assume that
$\nu = O(g^2)$, $\log g = O(\log N)$ and $\log \|Q\| = O(\log N)$. Then we may compute $U_p^{a,b}$
modulo $p^\nu$, simultaneously for all those $p < N$ such that $(a,b)$ is p-admissible, in

\[ g^{5+\varepsilon} N \log^{3+\varepsilon} N \]

bit operations.

Proof. We will systematically omit the superscripts $(a,b)$ for clarity. We may
assume that $N$ is even, and put $B = N/2$. Let $M_0, \ldots, M_{B-1}$ and $D_0, \ldots, D_{B-1}$ be
as in Propositions 8 and 9. Then the matrix

\[ J_p = (D_0 \cdots D_{(p-1)/2})^{-1} (M_0 \cdots M_{(p-1)/2}) \]

maps $W_{ap-1,\frac12(bp-1)}$ cohomologously to $W_{-1,0}$. The form $x^{ap-1} y^{-bp+1}\,dx/y$ is
represented by the vector $(1, 0, \ldots, 0)$ in the source space, so the coordinates of $U_p$ are
given by the first column of $J_p$.

To obtain results correct modulo $p^\nu$, we must bound the p-adic valuation of
$D_0 \cdots D_{(p-1)/2}$. First consider the contributions from the vertical and diagonal
reductions. Our hypotheses ensure that the $\delta$ and $c_0$ terms do not contribute.
What remains is the factor $2t - 1$ for $t = 1, 2, \ldots, \frac12(bp - 1)$. The only such integers
divisible by $p$ are $p, 3p, \ldots, (b-2)p$. Since $p > b$, the valuation contributed is exactly
$(b-1)/2$.
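
The count $(b-1)/2$ is easy to confirm numerically. The sketch below (hypothetical function names) sums the p-adic valuations of the odd integers $2t - 1$ for $1 \leq t \leq \frac12(bp-1)$. It also shows that the hypothesis $p > b$ matters: for $b = 9$, $p = 3$ the factor $9$ contributes valuation $2$, and the total is $5$ rather than $4$.

```python
def vp(n, p):
    """p-adic valuation of a nonzero integer n."""
    v = 0
    while n % p == 0:
        n //= p
        v += 1
    return v


def odd_factor_valuation(b, p):
    """Total p-adic valuation of the factors 2t - 1, 1 <= t <= (bp - 1)/2."""
    return sum(vp(2 * t - 1, p) for t in range(1, (b * p - 1) // 2 + 1))
```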

Now consider the horizontal reductions. If $b > 2a$ then no horizontal reductions
are performed, so we may assume that $b < 2a$. We must analyse the p-adic valuation
of $(2g+1)(2t-1) - 2s$ for a certain sequence of pairs $(s,t)$. For all these pairs we
have $t \leq \frac12(bp-1)$ and $s \leq ap-1$, so $|(2g+1)(2t-1) - 2s| < p((2g+1)b + 2a) < p^2$.
Therefore $(2g+1)(2t-1) - 2s$ cannot be divisible by $p^2$, so it suffices to bound the
number of factors $(2g+1)(2t-1) - 2s$ that are divisible by $p$. The pairs coming
from the proof of Proposition 8 are $s = a(2r+1) - b - 1 - j$ and $t = \frac12(b(2r-1) - 1)$
for $1 \leq r \leq (p-1)/2$ and $0 \leq j < 2a - b$. For these $s$ and $t$ we have

\[ (2g+1)(2t-1) - 2s = 2((2g+1)b - 2a)r - \big((2g+1)(b+2) + 2(a - b - 1 - j)\big). \]

Since $|(2g+1)b - 2a|$ is odd and less than $p$, the coefficient of $r$ is nonzero modulo $p$.
Therefore for each $j$, the factor $(2g+1)(2t-1) - 2s$ is divisible by $p$ for at most one
value of $r$. The pairs coming from Proposition 9 are $t = 0$ and $0 \leq s \leq a - 1 - \frac12(b-1)$.
For these pairs we have $|(2g+1)(2t-1) - 2s| \leq 2g + 1 + 2a < p$, so they do not
contribute any p-adic valuation.
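
The argument for the horizontal factors can be checked by brute force on small inputs. The following Python sketch (the name `count_divisible` is hypothetical) enumerates the pairs $(s,t)$ coming from Proposition 8 in the case $b < 2a$, verifies the displayed identity, confirms that no factor is divisible by $p^2$, and checks that each $j$ contributes at most one multiple of $p$. It assumes $p > (2g+1)b + 2a$, as in the definition of p-admissibility.

```python
def count_divisible(g, a, b, p):
    """Count horizontal factors (2g+1)(2t-1) - 2s divisible by p."""
    assert b < 2 * a and p > (2 * g + 1) * b + 2 * a
    hits = 0
    for j in range(2 * a - b):
        hits_for_j = 0
        for r in range(1, (p - 1) // 2 + 1):
            s = a * (2 * r + 1) - b - 1 - j
            t = (b * (2 * r - 1) - 1) // 2
            value = (2 * g + 1) * (2 * t - 1) - 2 * s
            closed = (2 * ((2 * g + 1) * b - 2 * a) * r
                      - ((2 * g + 1) * (b + 2) + 2 * (a - b - 1 - j)))
            assert value == closed          # the displayed identity
            assert value % (p * p) != 0     # |value| < p^2 and value is odd
            if value % p == 0:
                hits_for_j += 1
        assert hits_for_j <= 1              # at most one r for each j
        hits += hits_for_j
    return hits
```

The returned count is therefore at most $2a - b$, which is the horizontal part of the valuation bound.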

We conclude that $v_p(D_0 \cdots D_{(p-1)/2}) \leq \rho$ where $\rho = \frac12(b-1) + \max(0, 2a - b)$.
(The 'vertical' component of this bound is sharp, but the 'horizontal' piece may be
too generous by a constant factor. For practical computations it would be important
to find the optimal bound, but it does not affect our main asymptotic result.)

We apply Proposition 4 with $\lambda = \nu + \rho$ to compute the products

\[ D_0 \cdots D_{(p-1)/2} \pmod{p^\lambda}, \qquad M_0 \cdots M_{(p-1)/2} \pmod{p^\lambda} \]

for all $p < N$. By the above discussion, their ratio yields $J_p$, and hence $U_p$, correctly
modulo $p^\nu$, for those $p$ such that $(a,b)$ is p-admissible.

Now we analyse the complexity. The contribution from Propositions 8 and 9 is

\[ g^{5+\varepsilon} N \log^{1+\varepsilon}(gN\|Q\|) = g^{5+\varepsilon} N \log^{1+\varepsilon} N. \]

To estimate the contribution from Proposition 4, we may take $\tau = \max_r \log \|M_r\| =
O(g^2 \log(gN\|Q\|)) = O(g^2 \log(N\|Q\|))$. Certainly $\log \tau = O(\log N)$ and $\log \lambda =
O(\log N)$. Thus the cost of Proposition 4 is

\[ g^{3} \big(g^2 \log(gN\|Q\|) + g^2\big) N \log^{2+\varepsilon} N = g^{5} N \log^{3+\varepsilon} N. \qquad □ \]

Finally we may prove the main theorem. 

Proof of Theorem 1. According to [Ked01], the Weil conjectures imply that for each
$p$ it suffices to compute the Frobenius matrix modulo $p^{\mu_p}$ where $\mu_p \geq g/2 + (2g+1) \log_p 2$.
Therefore the bound $\mu = \lceil g/2 + (2g+1) \log_3 2 \rceil$ works uniformly for all
$p$. Note that $\mu = O(g)$.

Consider the terms appearing in the main sum in Proposition 2. The corresponding
values of $a$ and $b$ satisfy

\[ 1 \leq a = i + r + 1 \leq (2g-1) + (2g+1)(\mu - 1) + 1 = (2g+1)\mu - 1 \]

and

\[ 1 \leq b = 2j + 1 \leq 2\mu - 1. \]

In particular $a = O(g^2)$ and $b = O(g)$.

The definition of p-admissibility requires that $p > (2g+1)b + 2a$, and Proposition
2 requires that $p > (2g+1)(2\mu - 1)$. Since $(2g+1)b + 2a < (2g+1)(4\mu - 1)$, we
must first handle separately those $p \leq M$ where $M = (2g+1)(4\mu - 1) = O(g^2)$.
This can be done using (for example) Kedlaya's algorithm for each such $p$. The
complexity is $p^{1+\varepsilon} g^{4+\varepsilon}$ per prime, and there are $O(g^2)$ such primes, so the total is
$g^{8+\varepsilon}$.
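
For concreteness, the uniform precision $\mu = \lceil g/2 + (2g+1)\log_3 2 \rceil$ and the cutoff $M = (2g+1)(4\mu - 1)$ can be computed exactly, without floating-point logarithms. The sketch below uses hypothetical function names and relies on the equivalence $\mu \geq g/2 + (2g+1)\log_3 2 \iff 3^{2\mu - g} \geq 2^{4g+2}$.

```python
def uniform_precision(g):
    """Smallest integer mu with mu >= g/2 + (2g + 1) * log_3(2)."""
    mu = 1
    # Exact integer test: 3^(2 mu - g) >= 2^(4g + 2).
    while 2 * mu - g < 0 or 3 ** (2 * mu - g) < 2 ** (4 * g + 2):
        mu += 1
    return mu


def small_prime_cutoff(g):
    """The bound M = (2g + 1)(4 mu - 1) below which primes are handled separately."""
    mu = uniform_precision(g)
    return (2 * g + 1) * (4 * mu - 1)
```

For genus $2$ this gives $\mu = 5$ and $M = 95$, so only the primes up to $95$ need separate treatment.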

Now we use Proposition 10 to compute $U_p^{a,b} \pmod{p^\mu}$, for all pairs $(a,b)$
corresponding to terms appearing in Proposition 2. First consider the case $c_0 = 0$. Then
we have $C_{j,r} = 0$ for $r < j$, so the relevant pairs are those for which $1 \leq b \leq 2\mu - 1$,
$b$ odd, and $\frac12(b+1) \leq a \leq (2g+1)(j+1) - 1$. There are $O(g^3)$ such pairs. All these
pairs are admissible, and they are also p-admissible for all primes $M < p < N$ of
good reduction. The hypotheses of Proposition 10 are satisfied, and we obtain $U_p^{a,b}$
$\pmod{p^\mu}$, for all desired $p$, in $g^{8+\varepsilon} N \log^{3+\varepsilon} N$ bit operations.
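
The set of pairs in the case $c_0 = 0$ can be enumerated directly. In this sketch (the name `pairs_c0_zero` is hypothetical) each $j$ with $b = 2j+1$ contributes the values $j + 1 \leq a \leq (2g+1)(j+1) - 1$, that is $2g(j+1)$ values of $a$, so the total is $g\mu(\mu+1)$ pairs, consistent with the cubic count above since $\mu = O(g)$.

```python
def pairs_c0_zero(g, mu):
    """Pairs (a, b) needed when c0 = 0: b = 2j + 1 and (b+1)/2 <= a <= (2g+1)(j+1) - 1."""
    pairs = []
    for j in range(mu):                      # b runs over the odd integers 1, ..., 2 mu - 1
        b = 2 * j + 1
        for a in range(j + 1, (2 * g + 1) * (j + 1)):
            pairs.append((a, b))
    return pairs
```

Every pair produced this way satisfies $b < 2a$, which is the condition needed when $p$ divides $c_0$.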

Next consider the case $c_0 \neq 0$. The inequality for $a$ becomes $1 \leq a \leq (2g+1)(j+1) - 1$,
and the corresponding pairs are p-admissible for all primes $M < p < N$ of
good reduction, except those dividing $c_0$. Thus Proposition 10 yields $U_p^{a,b} \pmod{p^\mu}$
for all desired primes except those dividing $c_0$. We suggest two different methods
to deal with the remaining primes.

One possibility is to run the entire algorithm again, replacing $Q(x)$ by $Q_u(x) =
Q(x) - c_0 + u c_0 x$ for a suitable integer $u$. Then $Q_u(x)$ has zero constant term,
and agrees with $Q(x)$ modulo $c_0$, so the zeta functions at the primes dividing $c_0$
are the same. The only restriction on $u$ is that $Q_u(x)$ must be squarefree. This is
easily assured by computing the discriminant of $Q_u(x)$ as a polynomial in $u$; it has
degree $O(g)$, so has at most $O(g)$ roots, and we may easily find a non-root $u$ with
$u = O(g)$. In this case we still have $\log \|Q_u\| = O(\log N)$, so the complexity bound
is still valid.
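
A minimal Python sketch of the search for $u$ follows. It departs from the text in one respect, stated plainly: instead of computing the discriminant as a polynomial in $u$, it tests each candidate $u = 0, 1, 2, \ldots$ by checking that $\gcd(Q_u, Q_u')$ is constant, using exact rational arithmetic. All function names are hypothetical, and polynomials are coefficient lists with the constant term first.

```python
from fractions import Fraction


def poly_trim(f):
    """Drop trailing zero coefficients in place and return the list."""
    while f and f[-1] == 0:
        f.pop()
    return f


def poly_rem(f, h):
    """Remainder of f modulo h (h nonzero), coefficients lowest degree first."""
    f = f[:]
    while len(f) >= len(h):
        c = f[-1] / h[-1]
        shift = len(f) - len(h)
        for i, hc in enumerate(h):
            f[shift + i] -= c * hc
        poly_trim(f)  # the leading coefficient is now zero
    return f


def is_squarefree(coeffs):
    """True if the polynomial has no repeated root, i.e. gcd(f, f') is constant."""
    f = poly_trim([Fraction(c) for c in coeffs])
    df = poly_trim([i * c for i, c in enumerate(f)][1:])
    while df:
        f, df = df, poly_rem(f, df)
    return len(f) == 1


def shifted_polynomial(Q, u):
    """Coefficients of Q_u(x) = Q(x) - c0 + u*c0*x."""
    Qu = list(Q)
    c0 = Qu[0]
    Qu[0] = 0
    Qu[1] += u * c0
    return Qu


def find_u(Q):
    """Smallest u >= 0 for which Q_u is squarefree."""
    u = 0
    while not is_squarefree(shifted_polynomial(Q, u)):
        u += 1
    return u
```

For $Q(x) = x^3 - x^2 + 4$ the choice $u = 0$ gives $x^3 - x^2 = x^2(x-1)$, which is not squarefree, while $u = 1$ gives $x(x^2 - x + 4)$, which is.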

Another possibility is to use Kedlaya's algorithm to fill in the missing primes.
The difficulty is that a suitably sharp complexity bound for Kedlaya's algorithm, as
a function of $p$, does not seem to appear in the literature. We assert here, without
proof, that the complexity is $g^{4+\varepsilon} p \log^{2+\varepsilon} p$. (Briefly, this arises from the cost
of multiplying polynomials of degree $O(p)$ with coefficients having $O(\log p)$ bits.)
The number of missing primes is at most $O(\log |c_0|) = O(\log \|Q\|) = O(\log N)$, so
the contribution from the missing primes is $g^{4+\varepsilon} N \log^{3+\varepsilon} N$, which fits within our
desired complexity bound.

At this stage we have computed $U_p^{a,b} \pmod{p^\mu}$, for all relevant pairs $(a,b)$, and
for all primes $M < p < N$ of good reduction. The final step is to evaluate the main
sum in Proposition 2 and compute the characteristic polynomial of the resulting
matrix, for each $p$. We will show that this can be achieved in $g^{7+\varepsilon} \log^{1+\varepsilon} p$ bit
operations per prime, or $g^{7+\varepsilon} N \log^{1+\varepsilon} N$ bit operations altogether.

We know that $v_p(U_p^{a,b}) \geq -\rho$, where $\rho = O(g^2)$ is defined as in Proposition 10, so
to evaluate the sum we must work at a p-adic precision of $\mu + \rho$ digits. (Numerical
evidence suggests that in fact $p U_p^{a,b}$ is always p-integral for these primes. A proof
can probably be given along the lines of [Ked01, Lemma 2], but we do not need
this here.)

We may compute all the $\alpha_j \pmod{p^{\mu+\rho}}$ by a straightforward algorithm, using
$O(g^2)$ ring operations (i.e. operations modulo $p^{\mu+\rho}$), and all the $C_{j,r} \pmod{p^{\mu+\rho}}$
in $g^{3+\varepsilon}$ ring operations. Then for each $0 \leq i < 2g$, we may evaluate the main sum
in $O(g^4)$ ring operations, to obtain the reduction $T_i$ of $\sigma_p(x^i\,dx/y)$ modulo $p^\mu$. Note
that the $T_i$ are integral (see for example the proof of [Har07, Prop. 4.1]). The total
cost is $O(g^5)$ ring operations, or $g^{7+\varepsilon} \log^{1+\varepsilon} p$ bit operations.

Let $T \in M_{2g}(\mathbb{Z}/p^\mu\mathbb{Z})$ be the matrix whose columns are given by the $T_i$; we must
compute its characteristic polynomial. We sketch a simple deterministic algorithm
for this that avoids divisions by $p$. Compute the powers $T, T^2, \ldots, T^{2g}$. Their traces
are the power sums of the eigenvalues of $T$. Newton's identities may be used to
deduce the elementary symmetric polynomials in these eigenvalues, and thus the
coefficients of the characteristic polynomial. This requires $O(g^4)$ ring operations,
including a single division by each of the integers $2, 3, \ldots, 2g$, all of which are less
than $p$. The total complexity is $g^{5+\varepsilon} \log^{1+\varepsilon} p$ bit operations. □
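
The characteristic polynomial step can be sketched in Python as follows. This is an illustration under stated assumptions rather than the paper's code: the names `mat_mul` and `charpoly_newton` are hypothetical, matrices are plain lists of lists, classical matrix multiplication is used, and $p$ is assumed to exceed the matrix dimension so that $1, 2, \ldots, n$ are invertible modulo $p^\mu$. With power sums $p_k = \operatorname{tr}(T^k)$, Newton's identities read $k e_k = \sum_{i=1}^{k} (-1)^{i-1} e_{k-i} p_i$, and the characteristic polynomial is $\sum_k (-1)^k e_k x^{n-k}$.

```python
def mat_mul(A, B, m):
    """Product of square matrices A and B with entries reduced modulo m."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) % m for j in range(n)]
            for i in range(n)]


def charpoly_newton(T, p, mu):
    """Characteristic polynomial of T over Z/p^mu, leading coefficient first.

    Assumes p > len(T), so each k = 1, ..., n is invertible modulo p^mu.
    """
    m = p ** mu
    n = len(T)
    power = [[int(i == j) for j in range(n)] for i in range(n)]
    traces = []
    for _ in range(n):
        power = mat_mul(power, T, m)                 # T, T^2, ..., T^n
        traces.append(sum(power[i][i] for i in range(n)) % m)
    e = [1]                                          # e_0 = 1
    for k in range(1, n + 1):
        acc = sum((-1) ** (i - 1) * e[k - i] * traces[i - 1]
                  for i in range(1, k + 1))
        e.append(acc * pow(k, -1, m) % m)            # single division by k
    return [(-1) ** k * e[k] % m for k in range(n + 1)]
```

For the matrix with rows $(1,2)$ and $(3,4)$ the characteristic polynomial is $x^2 - 5x - 2$, so modulo $7^2$ the sketch returns `[1, 44, 47]`. The modular inverse via `pow(k, -1, m)` requires Python 3.8 or later.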